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Abstract 

An efficient method for the simulation of strained heteroepitaxial growth with intermixing us- 
ing kinetic Monte Carlo is presented. The model used is based on a solid-on-solid bond counting 
formulation in which elastic effects are incorporated using a ball and spring model. While ideal- 
ized, this model nevertheless captures many aspects of heteroepitaxial growth, including nucleation, 
surface diffusion, and long range effects due elastic interaction. The algorithm combines a fast 
evaluation of the elastic displacement field with an efficient implementation of a rejection-reduced 
kinetic Monte Carlo based on using upper bounds for the rates. The former is achieved by using 
a multigrid method for global updates of the displacement field and an expanding box method for 
local updates. The simulations show the importance of intermixing on the growth of a strained film. 
Further the method is used to simulate the growth of self-assembled stacked quantum dots. 

1 Introduction 

Heteroepitaxy is the process of slow deposition of a film of one or more crystalline materials on a 
crystalline substrate of a different material. The classical examples of film/substrate combinations 
include, for example Ge/Si, InAs/GaAs, and InP/GaAs. The natural lattice spacing between the film 
and the substrate differ by a few percent resulting in elastic stress. Consequently, as the film grows the 
elastic energy builds up. The formation of 3D islands reduces the elastic energy by relieving stress at 
the cost of increased surface energy. In many cases these islands are on the order of tens of nanometers 
and are referred to as quantum dots. These quantum dot materials are of importance in construction 
of some optoelectronic devices. 

The theory of island formation is well understood in the context of Asaro-Tiller-Grinfeld instability 
[TJ[2] • This explains the instability of a single component stressed film to perturbations of the flat surface. 
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The fiat surface is unstable under sufficiently long wavelength perturbations. These perturbations grow 
through mass transport on the surface, along the free energy gradients, leading to formation of surface 
ripples that later grow into large islands with stress concentration in the valleys. The theory however 
fails to explain an experimental observation, namely in many cases the film first grows in a layer-by-layer 
fashion until it reaches a certain critical thickness. Then islands form on top of this layer (known as the 
wetting layer). This mode of growth is known as the Stranski Krastonov (SK) growth modern]. 

It was demonstrated experimentally by Cullis and co workers [3 [6], that the intermixing between the 
film and substrate is of importance. The presence of compositional non-uniformity and the dilution of 
the film by substrate material can have a stabilizing effects. Moreover the presence of vertical and lateral 
segregation in the case of alloy films [7] , can further change the nature of the instability. Subsequently, Tu 
and TersoffJS], using a continuum model developed by Spencer et al[3], argued that Stranski Krastonov 
growth is a kinetic effect and the wetting layer is not stable. A crucial aspect of their proposal is the 
intermixing that occurs between the film and the substrate. 

Another important example in which intermixing occurs on very large length scales is the self assem- 
bly of stacked quantum dots. Here the dominant mechanism of intermixing results from the deposition 
process in which several layers of one material deposited which are followed by several layers of another 
material and so on. Since the materials are not lattice matched elastic strain develops and it is ob- 
served that the film forms quantum dots. The quantum dots in different layers align themselves to self 
assemble into stacked quantum dots. Intermixing of the film material and the capping layer needs to be 
addressed in this situation. Some related work, both experimental and computational, can be found in 

Refs. ina in nana us ma- 

It is clear that the simulation of strained heteroepitaxial growth with intermixing has important 
applications. One reasonably popular approach is based on numerical solution of continuum equations. 
Indeed recent work in this direction [8] 02 [TO] HU [15] has been able to capture many aspects of the 
film growth. Another approach is based on kinetic Monte Carlo (KMC). Since KMC is based on an 
atomistic scale it is typically slower than a continuum formulation. On the other hand, KMC not only 
captures all the physical effects modeled by continuum approach but can naturally include discrete and 
stochastic effects such as nucleation, surface roughness, and intermixing. 

The purpose of this paper is to present a KMC model of strained heteroepitaxial growth with 
intermixing and an algorithm for its efficient simulation. We extended the model proposed by Orr 
et al [T7] and Lam, Lee, & Sander [T8] to include intermixing. This model is a solid-on-solid bond 
counting scheme in which elastic interaction are included using a ball and spring model. The efficient 
simulation of this model represents the major contribution of this paper. 

One computational bottle neck for the simulation of strained film growth is the repeated calculation 
of elastic field. In this paper we address this by the inclusion of intermixing into the multigrid-Fourier 
method developed by Russo & SmerekaflHEQ]. In addition we present a cleaner way to incorporate the 
substrate into the multigrid method and simplify the coarse-graining and prolongation operators. As 
it turns out, for the KMC model used here, the change in total elastic energy, when a surface atom is 
removed, is needed. It was established by Schulze & Smereka[21] that despite the long range nature of 
clastic interactions one can accurately compute this energy change by local calculations. In this work, 
we have verified that this approach also works in case of intermixing. In addition we have presented 
numerical evidence that the asymptotic expressions presented in Ref. OS] remain valid in the case of 
intermixing. 

Another computational bottleneck is the computation of the rates. In order to implement rejection- 
free KMC one must know the hopping rates for all the atoms. To accomplish this one would have to 
compute the change in elastic energy that occurs when each and every atom is removed. This would be 
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prohibitively expensive. Fortunately, it was argued in Ref. [2JJ that the current elastic energy density 
can be used to provide fairly sharp upper bounds on the change in elastic energy. These can be used 
to provide a rejection-reduced KMC algorithm. Here we demonstrate that this approach works well for 
the intermixing case. In our simulations the rejection rate is approximately between .5 and 5 percent. 

In the following sections we explain the KMC model and present our algorithm to calculate the elastic 
displacement field and elastic energy in the case of a multicomponent film. Some results are presented 
that illustrate the effectiveness of the algorithm; namely, the effects of intermixing on the morphology 
of a growing strained film and a simulation illustrating the self-assembly of stacked quantum dots. 
To the best of our knowledge these results are the first kinetic Monte Carlo simulations of strained 
heteroepitaxial growth with intermixing. 



2 Model Description 

Our model is based on the solid-on-solid model presented by Orr et al [T7] and Lam et al [TH] ■ We consider 
a semi- infinite substrate which is initially composed of a single component. For ease of exposition we 
shall refer to the substrate material as Silicon. The deposited material will be composed of a prescribed 
mixture of Silicon and Germanium. The model and algorithm will be detailed for a two component 
film but it is not limited to this. In this paper we restrict ourselves to 1+1 dimensions. The atoms in 
the crystal occupy sites on a simple cubic crystal, within the solid-on-solid framework (no overhanging 
atoms). Therefore the height of the surface is a function of the horizontal coordinate denoted by I. Each 
atom in the lattice is bonded with its nearest (4 possible) and next to nearest neighbors (4 possible) by 
bonds of strength 7. 

The elastic interactions are modeled by means of a ball and spring system. Each atom on the lattice 
is connected to its nearest and next to nearest neighbors by Hookean springs. The lateral and vertical 
springs have a spring constant of kh and the diagonal springs kn- We choose kn = corresponding 
to the isotropic case [TH]. The natural bond lengths (lattice spacing) are denoted by a ss , a sg and a ggi 
for Si-Si, Si-Ge and Ge-Si bonds respectively. If these quantities are equal there are no forces due to 
lattice mismatch and hence no elastic energy in the crystal. In general since a ss + 1 a sg ^ a gg forces do 
arise. This is addressed in detail in the next section. 

The crystal evolves by rearranging itself through motion of surface atoms. By virtue of the solid- 
on-solid assumption each surface atom is uniquely specified by £ ( the horizontal component) and this 
is what we mean when we specify the £th surface atom. The hopping rate of the £th surface atom is 
modeled by 

~AE + E Q ~ 



Re = Rq exp 



(1) 



where AE is the change in the total energy when removing the £th surface atom, R is the attempt 
frequency, kg is the Boltzmann constant and T is the absolute temperature. We write the total energy 
as 

E = E chem + W 

where E c h em is the total chemical energy and W is the total elastic energy. The chemical energy can be 
thought of as the contribution to the total energy from local interactions whereas W is the contribution 
by the long range interactions. Since we are assuming that the bond energy is the same for nearest and 
next to nearest neighbors, it follows that 

AE chem = -jN 
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where TV is the total number of nearest and next to nearest neighbors. Therefore, the change in total 
energy is given by 

AE = -jN + AW (2) 

The calculation of AW is described in the next two sections. The parameters Eq and Rq will be chosen 
to match the adatom diffusion rate to experimental values. When an atom hops it moves to another 
surface site whose horizontal coordinate changes by ±1, with equal probability. As a simplification we 
shall ignore the elastic contribution to the energy barrier for adatoms; therefore the rates we shall use 
are 

tfoexp(^g^) if iV<3 

Ri = { B (3) 

f -N 1 + AW + E Q \ 

V k^f ) 

The number 3 corresponds to the number of bonds of an adatom on a flat terrace. The adatoms 
with less than 3 bonds are assumed to hop at the same rate as other adatoms. This will reduces the 
computational time significantly. No significant change is observed in the results due to this treatement 
for the adatoms. 

Finally we deposit atoms at a rate of 

R dep = FM (4) 

where F is the deposition rate in monolayers per unit time and M is the total number of sites in the 
horizontal direction. 

The model by Lam et al [18] allowed the possibility of large hops with i? adjusted to correct for 
the large hop size. This greatly improved the computational speed but at the same time not changing 
the results too much. However in the case of intermixing we observed that the inclusion of this feature 
significantly retarded intermixing. Hence large hop strategy was not adopted in this work. 



3 The Displacement Field 

The calculation of the hopping rate of an atom involves the computation of elastic energy of the crystal 
in different configurations. The elastic energy is merely the sum total of the energy stored in each spring 
when the crystal is in mechanical equilibrium. The energy in a harmonic spring is proportional to the 
square of the change in length of the spring (relative to natural length) . This is computed in terms of 
the displacements of the atoms from a reference configuration. The computational challenge is quickly 
computing the equilibrium displacements of the atoms. 



3.1 Forces In the Reference Configuration 

The first step toward the calculation of the elastic energy is the choice of a reference configuration. The 
displacements of atoms are calculated with respect to this reference configuration. We choose this to 
be a cubic lattice, whose lattice spacing equals that of pure substrate material. This ensures that the 
substrate is devoid of any forces in the reference state. Thus two atoms in the reference configuration 
are a distance a ss apart. The natural lattice spacing however depends on the nature of the two atoms. 
The natural lattice spacing is a = a ss ,a gs , and a gg , when the two atoms are Si-Si, Ge-Si, and Ge-Ge 
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respectively. It is useful to introduce the misfit parameter, e, 

f= (a-a^ (5) 

&SS 

This typically varies from e = —0.06 to + 0.06. In the two component system Ge/Si we have two 
interactions to be considered Ge-Si and Ge-Ge and we introduce the following quantities 

\&sq &SSJ i \®qq ^ss) / n \ 

e sg = ^—2 - and e gg = '-. (6) 

&ss Uss 

To compute the forces, that arise from the misfit, on the atoms in the reference configuration let us 
focus on two nearest neighbor atoms. The force exerted by the spring on the atoms can be calculated 
to be fez, (a — a ss ) in magnitude where fcj, is the spring constant for nearest neighbor atoms. In the case 
of next to nearest neighbor interaction, the force is calculated to be ku{a — a ss ) in magnitude, to first 
order in e. The spring constant for next to nearest neighbors is fcu. Before the general formula for the 
forces in the reference configuration are given, we first present an simple example. 




Figure 1: Configuration shows the atom of interest in the center surrounded by 8 neighboring sites. The 
Green circles represent Si atoms, the Red represent Ge and Blue the empty sites. 

An Example. In order to make things clear we present a sample calculation of forces in the configuration 
shown in Fig: [TJ The forces are given by 

f-=(:), -=(»), - -=(:!;;)• e> 

The first two correspond to interactions with empty sites and the last one arises from a Ge-Ge interaction. 
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The forces from Si-Ge interactions are given by 
p-1,0 : 



K 9 



F° - 1 









K 9 



where 



r N — «, 



L^sg^ss ; 



F" 1 - 1 



NN 



F sg 

NN 
l?sg 

NN 



^D^sg^ss j 



F 1 - 1 



psg 



psg 
NN 



and F 



gg 

x x — 



JT S 9 
~ r NN 



'D^gg^ss • 



(8) 



(9) 



The General Case. The crystal is a lattice indexed by (£, j). The first index represents the horizontal 
coordinate. The crystal is periodic in this index. The second is the vertical coordinate. The crystal 
is semi-infinite in this index, i.e, j < hi where hg is the location of the surface. All sites j > hi are 
unoccupied. 

We now write down the misfit forces for the general case. The net force on an atom at site (£, j) 
in the reference state, due to its nearest and next to nearest neighbors (j + m,£ + n) where (to, n) £ 
({-1,0,1}, {-1,0,1}), is given by 



where F^™ are given by 



G ej 



m,n= — 1,0,1 



Timn 



(10) 



1.1 



P o,i 



Jt,3 



'(,3 



fl,l 



-1,0 



f-3 



't,3 




? 1,0 



f 1,0 



(11) 



F -l -1 
¥ t3 



with fl™ n defined as 



Ji,j 



f 
Ji 



mn 
ij 



j-0,-1 



p i ,-i _ 

F t3 - 



fh-1 



&i,j;n,m Q j ;m,n ^jjG s 



if (m,n) £ (±1,±1) 
otherwise. 



The connectivity matrix, (7i j. m n , is defined as 



r i if si 

0>,™ - | o Qth( 



sites (£,j) and (£ + m,j + n) both contain atoms 
otherwise. 



and the misfit matrix, eej. n , m , is defined as 

£ gg if sites (£, j) and {£ + m,j + n) both contain Ge atoms 
ce,j;n,m = \ e s g if sites (£, j) and {£ + m,j + n) contain a Ge and a Si atom 
otherwise 



(12) 



(13) 



(14) 
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Note : The misfit for a Ge/Si system is e gg = 0.04 is known. But our model demands a parameter 
e sg which is not clear. We have conducted simulations with both e sg = 0.04 and e sg = .02. The first 
choice considers a sg = a gg and the second considers a sg = 0.5a gg . The second choice seems appropriate 
if one considers the two species as hard spheres of different sizes. Preliminary simulations with the 
first choice revealed quantitative differences in the island sizes but no apparent qualitative differences in 
the nature of the instability. With the true nature of the interactions being unknown, we present only 
results corresponding to e sg = 0.02. 

3.2 Interactions 

The atoms in the reference state experience forces from their neighbors as given by Eqn. 1101 The atoms 
become displaced from the reference position to achieve mechanical equilibrium (force balance) and we 
denote the displacement of the atom at site (£,j) as (uij,Vij) T . 

Our aim is to calculate (uej, Vej) T ■ We will assume that there is a J such that the crystal is made of 
pure Si for j < J and that all sites in this region are occupied. This means that in the region j < J the 
misfit forces given by Eqn. [TOlwill be zero. Whereas these forces will be nonzero for j > J; this includes 
the deposited atoms and the regions where the deposited atoms have intermixed with the substrate 
atoms. We treat these two regions separately. In order to set a convenient language for the rest of this 
paper we will refer to these regions as the film (j > J) and substrate (j < J), not to be confused with 
the original substrate and the deposited material. Further we choose our indices such that J = for 
convenience. 

3.2.1 Interactions in the Film Region 

For the film region corresponding to the indices j > 0, the force balance gives us 

= Ftj + k L (atj ; i,o(ut + i,j - utj) + o-£ J; _i : o(^-ij - u ej )) 

+ *f (a ej , 1A (u e+w u ej ) + a Wi _ 1 , 1 (u < _ 1J+1 - ««)) 

+ - Uij) + (?e,j;-i,-i(ue-i,3-i - Uij)) 

ko 

+ ~Y ( a ^j-iA v e+i,j+i - v ij) - <^,j;-l,l(^-l,j+l - v ij)) 
ko 

+ -y (—ct,j;l,-l( v t+l,j-l ~ %') + c^-i.-i^-ij-i - V^)) (15) 



and 



= G ej + k L (cr^j ;0 ,i(%- + i - v tj ) + er^jo.-iXuij-i - v^)) 
kr> 

+ ~ %') + &£,j;-l,l(ve-l,j+l - Vtj)) 

ko 
ko 

+ — (&£,j;i,i(u£+i,j+i — uij) — aej;-i,i(ue-ij + i — u^)) 
ko 

+ ~2 ^ Uij) + &£J; — 1, — 1 {u£ — lj — l - Uij)) (16) 
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where aij; n ,m is the connectivity matrix defined in Eqnll3l This gives a linear system of equations 
that can be solved for (uij,vgj). This system interacts with the substrate region. Note that the above 
relations at j = use tt£ — 1, which are displacements in the substrate. 

3.2.2 Interactions in the Substrate Region 

Now for the region j < we first note that, this is a semi-infinite substrate. There are no forces arising 
from misfit in the substrate and requiring mechanical equilibrium gives us 



kp 

-^-{ve+i,j+i + ve-ij-i - ve+ij-i - vt-ij+i) (17) 



= k L (v£ J+ i - 2vej + vtj-i) 

+ ~2~( v i+id+i + Vi-ij+i + vt+ij-i + vt-ij-i - 4v£j) 
ko 

+ -^( u e+ij+i + u £-ij-i - uz + xj^x - ut-ij+i). (18) 

This is a homogeneous system of equations that takes a non-trivial solution due to the displacement field 
of the film (at j = ). This gives us the displacement field for j < of a relaxed substrate with given 
displacements at j — 0. The system in Eqn. [P7l and [TBI needs to be solved for j < given (ue t o>ve,o)- 
We do this by using a Fourier series in the rr-direction: 

M 



Now inserting this into Eqn. [T7l and fTSl we get 

= 2fcr,Uj* (cos £ — 1) + kr> + ^,3+1) cos £ — 2u£j) 

+i -%j-i)sin£] 

= 2fcz,(u£j + i - 2% + j-i) + fee + cos£ - 2u 5j 

+* ^ %j-i) sin £] • 

We look for a solution for the above system in the form 

and obtain a linear homogeneous system of the form 

n(a) ( |>° ) = 0. (20) 
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The above system admits non trivial solution if 

P(a) := det[n(a)] = 

giving raise to a polynomial equation in a. The polynomial P(a) is of degree 4 and has roots of the form 
(oil, 0.2, l/aii, 1/012) where |ai|, \a%\ > 1 [IH]- Since we are interested in solutions Uj = a^uo, % = a^vo 
that decay as j — » —00 we pick the two roots ai, 0:2- 

^W(j)Q-W J' ) (21) 



where Q(j) is the invertible 2x2 matrix given by 



Finally, fp and a p are the eigenvectors and eigenvalues that arise when solving the discrete equation. 
The solution of Eqn. (|17p and (fTS)) is then given by 

u ^ ) =V^Q-i( ) f E*° ^ e *« where f ^ = — V ( Ui >° ) e^ 1 (22) 
A three dimensional version of this was presented in [19] and similar approach was presented in |24j . 

4 Elastic Energy of the Crystal. 

The elastic energy can be computed using the displacement field calculated by the algorithm in the 
previous section; it is 

W= Yl l k s P ringS 2 (23) 

all springs 

where k spr i ng — k^^ku depending on whether the spring is a diagonal or lateral spring, and S is the 
change in spring length with respect to natural spring length. We rewrite W as 

W = W f + W s (24) 

where 

Wf = ^ ] ~^k S p r i n g8 and W s = ^ ' ~^kspringfi 
j>0 j<0 

where Ylj>o indicates summing all springs connected to atoms located at sites with j > 0, X^<o * s 
defined in a similar way. We can write the expression of Wf as a sum over atoms: 



w f = ^J2 w ^ ( 25 ) 



2 

j>Q 

where ifjj is the energy of all the springs connected to the atom at site (i, j). The expression for Wij 
is given at the end of this section. Since there are no forces arising from misfit in the substrate we can 
write the expression for the elastic energy of the substrate 

1 T 

W s = --uj A s u s 
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where A s is the non-positive infinite dimensional matrix representing the interaction of the atoms in 
the substrate and u s is the vector representing their displacements. It is convenient to decompose u s 
into the displacements of the atoms at j = and those for j < 0: 

u 

Uj"<0 

Since atoms in the substrate below the first layer produce no net force on each other it follows 

As ( u- <0 ) = ( 

where fo is the force on the top layer of the substrate atoms due to the substrate atoms. If we use the 
above expression in Eqn. 3] we find 

W s = --f • u 

which can be written as 

M 



W s = - \^r,{u lfi fi + vtjogt) (26) 

where 



2 



ft = kc(ut-i,o - 2u £fi + ut+i) + -^-(ue-i-i - 2utfi + + -y-(«£-i,-i - 

and ^ 

9t = k L (-Vi t0 + vt-i) + - 2ve,o + + -^-(ui-i-i - ui + x-x). 

Therefore the total energy of the crystal is given by Eqn. (|24|) along with Eqns. (|25[) and ([26]) . It should 
be pointed out that this expression could have also been obtained using summation by parts. Finally, 
as promised we present the expression for Wij 



wf * + w yy + 2 w% (27) 



with 



w xx 
13 



k L 



— (CT»,i;l,o(«i+l,j - - d) 2 + &i,j;-l,o(Ui-l,j - + d) 2 ) 

n,j;i,i(ui+i,j+i — u i: j — d) 2 + o-jj ; -i,-i(ui-ij-i — Uij + d) 2 
+ - u i,j - d) 2 + (Ti,j--l,l(ui-l,j+l - u hj + d) 2 ) 



wYF = — (o"i,j;o,i(ui,j+i - Vi,j - d) 2 + ai,j;o,-i(vi,j-i - v hjj + d] 2 ) 



y 2 

— (cij;i,i(fi+ij+i - Vi,j - d) 2 + <Tij--i,-i(vi-ij-i - Vi,j + d) 2 

+ <ri,j;i,-i(yi+ij-i - v it j + d) 2 + cr i)i ._ ljl (u i _ liJ - +1 - v it j - d) 2 ) 
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If 



■'HI 



((7i,j;-l,-l(Ui-l,j-l 



+ d)(vi-i,j-i - Vij + d) 



a, 



d)(l>. 



d) 



'h3'< 



1,1 (Ui 



J h3 



d)), 



where 

a gg — a ss Ge-Ge bonds (28) 



i gs — a ss Ge-Si bonds 



for Si-Si bonds 



5 Multigrid Fourier Algorithm 

In this section we describe a multigrid Fourier algorithm to solve the linear system, derived in the 
previous section, for the displacement field. This algorithm is quite similar to the one presented in [20] 
but there are differences, first it was extended to a multicomponcnt system, second the incorporation of 
the substrate was simplified as was the coarse graining procedure. In [20] the substrate was removed and 
replaced by effective forces. Unfortunately standard SOR (the method of Successive over Relaxation) 
is unstable when this is done and the resulting system had to be under relaxed to stabilize it. Here the 
substrate is handled in more seem less way by using Eqn. [22] to supply the boundary conditions. As we 
shall see this formulation coarse grains nicely resulting in an efficient algorithm. 

Multigrid is an efficient method for solving linear systems Au + F = where A comes from the 
discretization of a partial differential equation. A good introduction to the idea of a multigrid algorithm 
can be found in the book by Briggs [22] . Two important tools needed to implement multigrid algorithm 
are coarse graining, CF, and prolongation operators, FC. The operator CF takes data from fine grid to 
a coarse grid; if u is a vector of length N then CFu will be vector of length N/2. The operator FC takes 
data from a coarse grid to fine grid, if u is a vector of length N then FCu is vector of length 27V. If 
u is smooth then u « FC(CFu). Naturally there are many different ways to formulate these operators 
but typically coarse-graining operators are based on averaging and the prolongation operators are based 
on interpolation. In addition, a multigrid algorithm needs a course grained version of A, denoted as 
A^ 2 ). If A is N x N then A^ 2 ' will be N/2 x N/2. One natural way to construct A^ 2 ^ is to consider 
the discretization on a coarser grid. 

Our goal is to solve Au + F = 0. Suppose we have an an approximate solution, u Q . To assess its 
accuracy we can compute the residual: r = Au a + F. The relationship between the residual and the 
error, e = u u a , is 

Ae + r = (29) 

One could solve the above equation for e and then determine u using u = u Q + e, at first sight that would 
appear as much work as solving the original system. The crucial observation used by multigrid methods 
is that if an iterative solver such as Jacobi Gauss Seidal or (SOR) is used to obtain the approximate 
solution then the residual, r, and the error, e, are known to be smooth. Indeed this is why these methods 
converge so slowly. Since r is smooth little information is lost because of coarse-graining. For this reason 
multigrid methods replace Eqn. [21] by 

A( 2 )e( 2 )+r( 2 > =0 (30) 
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Once this smaller system is solved the solution then updated using u" ew = u a + FCe( 2 ). One then 
applies SOR or Jabobi on the fine scale using u™ 6 ™ as a guess. This procedure is repeated until the 
residual is sufficiently small. This represents a two level multigrid scheme. In most implementations 
this procedure is extended to many levels. 

Implementation of a multigrid method for the system given by Eqns. I15H16I could be accomplished 
with a standard multigrid method if the film profile was flat and there was no artificial boundary 
condition. The main difficulty is the formulation of coarse-grained versions of Eqns. [151 [TH and [22l 
One approach is to use algebraic multigrid methods as in [53] . The approach outlined here was presented 
in [20j in which the problem was defined on a rectangular domain using fictitious atoms. 



5.1 Fictitious atoms 

We start by defining a rectangular domain. Let j r 
Then we define our domain of computation to be 



be the vertical coordinate of the highest atom. 



n = {(i,j): l<i<M, 0<j<N}. 

where M is the period of the lattice in the horizontal direction and N > j m ax is an integer. Some of 
these sites are occupied and others are not. All sites in that are not occupied by real atoms are called 
fictitious atoms. This is illustrated in the figure (Fig: The system of equations in (fH))) and (fTB|) are 




Fictitious atoms 



Film (Ge/Si) 



Substrate (Pure Si) 



Figure 2: Computational domain, with real and fictitious atoms. The red circles represent the Ge atoms and the 
green circles the Si atoms. The null interaction with fictitious atoms is represented by dashed lines. 



now extended to fictitious atoms, simply by setting the connectivity matrix to zero at these sites. This 
is done by defining the site atom density 



1 if (i, j) is a real atom 
[0 if (i, j) is a fictitious atom 

With these definitions, the connectivity matrix can be written as 

^i,j;m,n — Pi,jPi-\-m,j-\-n- 



(31) 



(32) 
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In order to simplify the notation we introduce the following spring strength matrices 



K x 
-1,-1 


= k D /2 


K x 


= 


K x 
1,-1 


= 


K x 

-"--1,0 


= k L 


A 0,0 


= 


K x 
-"-1,0 


= k L 


K x 

"■-1,1 


= k D /2 


-"-0,1 


= 


K x 


= k D /2 



and 



"-1,-1 


= fc D /2 


-"0,-1 


= fcL 


K v 
1,-1 


= k D /2 


-"--1,0 


= 


-"0,0 


= 


-"-1,0 


= 


-1,1 


= k D /2 


-"0,1 


- k L 


1,1 


= k D /2 



Using these matrices we can rewrite (f!5)) and p^|) as: 
1 

^ Kmn a ej;mn ([ut+m,j+n - u ij] + mn[vi +mt:i+n - v ej }) - F ej = (33) 
m,n— — 1 

1 

^ ii^nC/jjmn (N+mj+n ~ W«] + mfl[«£ +mJ - + „ - Ufj]) - G# = 0. (34) 
m,n= — 1 

5.2 Coarsening and Prolongation Operations 

A key ingredient in the multigrid technique are the coarsening and prolongation operations used to map 
data between coarser and finer grids. We let L = 1, 2, . . . L g denote the scale of the grid. L — 1 denoted 
the finest scale and L g the coarsest. In our computations we will choose the number of grid points in 
the horizontal direction on the finest scale to be M = 2 P where P is an integer. Clearly then, L g < P. 
The number grid points in the horizontal direction for the other levels are 

Mr 

M L+1 = ^- L = l,2,...,L g -l (35) 

where Afi = M . In the vertical direction, we let N denoted the number of grid points on the finest 
scale. The number of grid points for the other levels is given by 

,1) L = l,2,...,L ff -l (36) 

where [ - J represents the integer part. 

5.2.1 Operators in horizontal direction 

In the case of the horizontal direction we have periodic boundary conditions and the number of grid 
points will be even for all levels. The coarse graining procedure we use is displayed schematically on 
Fig. Oand can be expressed as 

q i+1 = C x q L (37) 



N L+1 = max( 



N L + 1 
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Q O 




Level L 
Level L+1 



Figure 3: Coarsening operation in the horizontal direction with even number of grid points. The same relative 
weights are used in the prolongation, when interpolating values from the coarse to the fine grid. 

where q L is a quantity defined on the fine grid, q i+1 is defined on the coarse grid, and C x is the 
Mi + i x Mi, matrix given by 



/ 1 2 1 

1 2 1 

1 2 1 



C x 



\ 



1 2 1 



\ 1 • • • 1 2 / 

The operation to take variables from the coarse grid to the fine grid is simply 

q L = P x q L+1 where P x = 2C*J 

5.2.2 Operators in vertical direction 



(38) 



The situation in the vertical direction is slightly different mainly because the number of grid points, Nl, 
can be either even or odd. Our coarse graining strategy is shown in Fig. [4] and Fig. [5] We can write 
the coarse graining and prolongation operations as follows 



C y (\ L and q L = P y q 



L+l 
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The matrices C y and P y are denned differently for even and odd Nl. For Nl even we have 



/ 1 2 



2 1 



whereas for Nl odd we have 
/ 1 2 1 



1 2 



2 1 



P = A 



V 



1 2 1 



1 2 1 



and 



/ 2 

2 

1 1 

2 

1 1 
2 
1 



1 2 / 



1 

2 

1 1 

2 



1 2 1 



1 2 1 
1 



and 



/ 2 

2 

1 1 



2 

1 1 
2 
1 



/ 



Remark. It should be pointed out that P y is not quite equal to 2Cj as was the case in the x direction. 
In [20] it was incorrectly asserted that their prolongation operator, P, could be written as twice the 
transpose of the coarsening operator. 
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Level L 



Level L+1 



Figure 4: Coarsening operation in the vertical direction for an even number of grid points on the fine scale. The 
blue circles represent the grid outside the computational domain where the data is assumed to be zero. 







o p o o 

o o o 



Level L 



Level L+1 



Figure 5: Coarsening operation in the vertical direction for an odd number of grid points on the fine scale 



" L+1 - CF(q L ) 



5.2.3 Coarsening and Prolongation Operators in Two Dimensions 

We construct our coarse from fine operator, CF, and the fine from coarse operator CF, as the tensor 
product of the one dimensional operators. Namely 

CF = C x (g) C y and FC = P x ® P v 

Therefore if q L is a quantity defined on the grid for level L then we map to the coarser grid of level 
L+1 using 

In an analogous way we can map q L to finer grid using 

q^ 1 = FC(q L ) 

5.2.4 Coarse-grained interactions 

Here it will be outlined how to coarse-grain Eqs. (|33| and (|34[) . They will be written in terms of the 
coarse grained values of the site atom density. The site atom density, Pij, given by pip , is coarsened 
for all levels L — 2, . . . L g using CF. The coarse-grained site atom density is used to coarse-grain the 
connectivity matrix as follows 

L 



the geometric mean of the site atom densities at the two sites. The coarse-grained version of Eqs. (J33J) 
and (l34l) are written as 



22 2L ^ cfj. mn K!^ n (uf +m j +n — u^j + mn(vf +m j +n vjjfj + — 

m,n— — 1 
1 

22 2L ^ c fj;mn^mn ( V f+m,j+n — V tj + mn ( U £+m,j+n ~ U lj)) + Gfj=0 



(39) 
(40) 



m,n=— 1 



1G 



The factor 2 2 ~ 2L is typical of coarsened elliptic equations. This can be interpreted as the weakening of 
springs as several springs are replaced by a single spring. The boundary conditions at j = — 1 are given 
by fiTIty on the corresponding grid level. In other words 



%3 



Ml 



E^HSj 6 * where 



U£0 
%l 



— Y 

Mr ^ 



ue,o 



(41) 



The system of equations given by Eqs. 
as 



(|4"0|) , and (|4T|) . will for sake of convenience, be written as 
A L u L + F L = 0. (42) 



5.3 Successive Over Relaxation 

In our multigrid algorithm we shall use the method of successive over relaxations to generate approximate 
solutions. To explain our implementation it is useful to rewrite these equations as 



<:fy 



F L 



G 



(43) 



where 



a 



y ] c lj-,mnK m , 



a - 2 2 - 2L 

m,n— — 1 
1 

2 2-2L ^ ctj-mnK^mn, 

m,n= — l 



2-2L 



^ ] c lj;mnK mn TTin, 



m,n— — 1 
1 



12-2L 



c- = 2 



2-2L 



c £j;mnK mn (u e+m j +n + mnv l+ m ,j+n) 3 



and 



-,2-2L 



We have omitted writing the explicit dependence of the a and c coefficients on £ and j. The relaxation 
scheme for the above system is based on treating the right hand side as known. Let us denote by 
(( u ej) k A v ij) k ) T the value of the displacement at iteration k. Then the solution at the next iteration, 
(( u fj) k+1 j ( V ej) k+1 ) T tne i s computed by relaxing system (|43f by one SOR iteration. The SOR relaxation 
is performed as 



(uf,) k+1 

«) fc+1 



(c k x - F% - a x >£)*)/a 



Lt/j 



a yx{^lj) +1 )/ a yy 



(44) 
(45) 



where the superscript fc on and c y indicates that these terms are evaluated using the displacements 
at the feth iterate. 
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5.4 Multigrid V-cycle Implementation 

In our computations we implement the multigrid algorithm using a standard V-cycle which, for sake of 
completeness, we will now describe. The V-cycle starts with the following steps: 

Precomputation 

1. Compute F 1 using (fTU|) . 

2. Compute Nl and Ml for L = 2 to L g using (|36|) and (f35| . 

3. Coarse-grain the site atom density: = CF(p L ) for L = 2 to i 9 

4. Compute the connectivity matrix: ( af; m n ) for L = 2 to i g 

5. Initialize first guess for u guess (usually u guess — or an existing field). 

V-cycle 

For L = 1 to L g — 1 do the following 

Relax A L u L + F L = for 77 steps (77 = 2 in our calculations) 
Compute residual r L =F L + A L u L 
Coarse Grain the residual: r i+1 = CF(r L ) 
Set F i+1 = r L+1 

Set u L+1 = (This is the initial condition for the next relaxation) 
End Loop 

Solve A Ls u is + F is = by relaxation 

For L = L g — 1 to 1 do the following 

Prolong the solution on the coarse mesh: e L = FC(u i+1 ) 
Let u^ uess =e L +u L 

Relax A L u L +F L = with initial guess Ug Uess for 77 steps 
End Loop 

The accuracy of the solution is measured using the Li norm of relative residual defined as 

RGlobal = |H|2/||F||2 

The V-cycles are repeated until Rciobai < £g where Eg is a specified tolerance. 

5.5 Calculation of Elastic Energy Differences 

Our goal is given surface site, (£, hi), we wish to calculate the change in the elastic energy 

AW = W(u) - W{u a ) 

where u and u a are the displacement fields with and without the atom, respectively. Table [1] shows 
the computational time required to accomplish this task. In more detail we consider a film, with 10 
monolayers of deposition, having a profile similar to the one shown Fig. [5J We begin with an updated 
displacement field and remove an atom and compute the change of elastic energy. The atom is then 
replaced; this is done each and every surface atom. Results for the computational time are displayed 
on Table [TJ These were obtained using a 3.6 GHz Intel Pentium 4 Processor Linux Box. It should be 
pointed out that eg = 10 -2 provides accurate values to AW i.e. within 1 to 5 %. 
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Fourier-Multigrid CPU Time 


System Size 


e g = 10" a 


e g = 10" a 


e g = 10- 4 


512 


0.030 


0.065 


0.202 


1024 


0.056 


0.128 


0.420 



Table 1: Average CPU time in seconds to update the displacement field and calculate the change in 
elastic energy 

6 Local Elastic Calculations 

Even though the multigrid method greatly reduces the computational time for an update of the elastic 
displacement field it is still far too slow considering the large number of updates required during the 
course of a kinetic Monte Carlo simulation. In this section we will first examine the long range nature 
of elastic interactions and then outline a strategy for the computation of elastic energy differences using 
local updates of the displacement field. 

6.1 Long Range Nature of Elastic Interactions 

As is well known elastic interactions are long ranged and in fact for strained films they are even longer. 
To make this point clear it interesting to consider the following quantity 

AW p = W(u; n p )-W(u a ;fl a p ) 

where 

ftp := {(£, m); i — p < i < i + p, hi > m > hi — p} 

and 

n« : =n„\{(i,fci)}. 

Also in the above formula W(u; Q p ) is taken to mean the total elastic energy inside the region Q,p with 
a displacement field given by u as shown in Fig. [51 Figures [7] and [5] demonstrate the very slow decay of 
AW P -> AW by plotting (AW - AW P )/AW vs. p for two different profiles. 

These results can be further understood in the context of continuum elasticity for a film on semi- 
finite substrate - which is a reasonable approximation since the ball and spring system is a discretization 
linear elasticity. 

We take the film profile to be of the form h(x) — T+c(x) where c(x) is a smooth compactly supported 
function whose region of support includes x — 0. If one employs the small slope approximation and if a 
small amount of material is removed from the surface at (0, h(0)) then 

AW P = AW(1 + 0(T/p)) as p -> oo (46) 

where f2 p is a semi-circular region of radius p centered at (0, h(0)). For more details see [2Tj . 

6.2 Principle of Energy Localization 

Given that the goal is to update the displacement field after removing only one surface atom it might 
seem reasonable to suppose that one could locally update the displacement in the vicinity of the the 
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. o 


I: 


0. 



Atom of interest 



Fictitious atoms 



Film (Ge/Si) 
- Substrate (Pure Si) 

Fourier Boundary 



Figure 6: Computational domain for local elastic solve with p = 2 with real and fictitious atoms. The red circles 
represent the Germanium atoms and the green circles the Silicon atoms. The blue region represents the grid points 
updated by SOR and the red region represents the boundary where u = u p 



removed atom. The results above suggest otherwise, however. Nevertheless, in |21j the authors present 
an algorithm to perform local updates to the elastic fields and calculate the change in the elastic energy, 
with and without the atom, by local calculations. This is based on what is called Principle of Energy 
Localization |21j which we will now explain. 

The idea behind Energy Localization is to compute AW by replacing u a by a locally corrected field 
u° which is computed by solving Eqs. ([55]) and for atoms in the domain fl p using u as Dirichlet 
boundary data. In this way we have the following approximate displacement field for the atom-off 
configuration 

i a p for (£,j) 
i otherwise 



6 0» 



(47) 



The principle of energy localization states that 

AWi oc = W(u; Q p ) - W(u a p ;n a p ) 

will be very close to AW provided that (|47|) is a good approximation to the displacement field. This 
result is rather surprising given that AW P is a terrible approximation and the only difference between 
them is that AWi oc uses a approximation to u a . 

To assess the accuracy of (l47|) we note that since both u and u° satisfy Eqs. (f33|) and (J34J) exactly 
for points not on the boundary of £l a p then residual will be zero everywhere except for these points. In 
other words the atoms on the boundary of f2° experience a small force. With this in mind we define 
local relative residual in the region fl p +i as 

1 

Rioc = rr- max \r ej \ 

ea ss K L (£,j)ea° +1 

This formula also considers the possibility that the solution inside f2° may not satisfy Eqs. 
exactly. 



331) and 
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Size of Expanding Box Size of Expanding Bo: 



Figure 7: The upper figure shows a portion of the film along with some of the expanding boxes. The 
black line shown on the lower left hand figure shows a plot of (AW — AW P ) / AW vs p. The colored 
lines show plots of (AW — AWi oc ) / AW vs p. The blue line is for 2 SOR iterations for each box, where 
green is 10 and red is 50. The lower right hand figure presents a plot of Ri oc for results on the left. In 
both graphs the dotted line corresponds to p~ 2 

The following numerical results are consistent with the principle of energy localization 

• AWi oc is an excellent approximation to AW as p increases. 

• The local residual Ri oc decreases as p increases. 

Figs. [7] and [5] show plots of the relative error, (AW — AWi oc )/AW for two different profiles. The 
results show that the method can provide accurate values for the change in elastic energy using local 
calculations. 
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Size of Expanding Box Size of Expanding Box 



Figure 8: This is the as Fig. [7] except the film profile is different. 



In addition one can appeal to continuum theory as discussed above to gain further insight into the 
results discuss above. In the same setting for Eq. (|46|) it is established in [21] that 

Rioc = 0{p- 2 ) as p^oo (48) 

and that 

AWi oc = AW(l + 0(p- 2 )) as p^oo (49) 
6.3 Expanding Box Method 

Based on the principle of energy localization Schulze & Smcrcka 21 proposed the Expanding Box Method 
to construct a local update of the elastic displacement field that can be used to find an accurate value for 
the energy differences. This method is based on using SOR in small neighborhood of the site where the 
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Expanding Box CPU Time 


System Size 


e g = 10- a 


e g = 10- a 


512 
1024 


0.0022 (466) 
0.0022 (930) 


0.0176 (301) 
0.0106 (608) 



Table 2: Average CPU in seconds to update the displacement field and calculate the change in elastic 
energy. The number in the parentheses is the number of successful applications of the expanding box 
method. 

atom was removed. When a change is made to the crystal configuration at one site, few iterations of SOR 
have negligible impact on the solution at sites that are at a distance of more than one lattice spacing 
away. The expanding box method constructs a series of nested domains £l a p for p = 2, 3, . . . , p max and 
finds a locally corrected atom-off displacement field, u|. Two iterations of the SOR algorithm described 
in ^5.31 are used and the atom-on displacement field, u, is used as Dirichlet data for the boundary 
points of fig. If it is determined that Ri oc < ei oc , then our local update of the displacement field has 
been successful. If not we repeat the above procedure until Ri oc < ei oc - If p = p m ax is reached before 
a successful local update has been achieved we resort to a global update using the multigrid-Fourier 
method. 

To give some idea on the computation speed of this method we present some results in Table [2] for 
Pmax = 50. Results for ei oc = 10~ 4 were not included since in this case the method fails for almost all 
attempts. It is clear that the method is significantly faster than the Fourier-multigrid method. Two 
specific examples of this method are shown on Figsl7]and[8l The first example shows an case where the 
local update would be successful. In the case shown in Fig. [5] the decay is slower than the former case 
because we are near the base of a large island. In this example the expanding box method would fail 
if Pmax < 50. The red curves shown on Figs [7] and [8] are consistent with the expressions given by Eqs. 
(|48|) and (|49|) . In addition the black curves on these figures clearly demonstrate the nonlocal nature of 
the elastic energy and is consistent with Eq. (|46|) . 

In the kinetic Monte Carlo algorithm, we use to simulate strain file growth, the removed atom is 
then moved to a neighboring site. Once the atom is moved the atom-on displacement field, u, must be 
updated. This is also done with the expanding box method. It should be pointed out that the theorems 
are valid for an initial solution u that is exact (in other words a globally computed solution is needed) . 
Nevertheless, in practice this "quilt" of locally corrected solutions is found to be of sufficient accuracy 
to provide faithful energy differences; this issue is discussed in more detail in [21j . 

7 Reduced-Rejection Kinetic Monte Carlo 

When using Eqn. [3] for parameters of physical interest one finds a wide range of different rates. In this 
situation, the most efficient way to simulate kinetic Monte Carlo is to use a rejection- free implementation. 
In this case the hopping rate of every surface atom R needs to be known before an event can occur. 
Even with the expanding box method, this is too slow for practical simulations. Here we outline a 
reduced-rejection approach which reduces the number of elastic computations. 

In this method the rates, Rg, are replaced by upper bounds, where > R where R up can 
be quickly evaluated. Now the atoms are selected with rates i?" p '. Once an atom is selected the actual 
rate is computed. To compensate for the overestimate in the rate, the selected atom will then hop with 
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probability Ri / i?" p . 

This introduces the possibility of rejection whose rate depends on how closely Ri is approximated 
by R^ p ■ It is clear by looking at ([3]) that if we can extract an appropriate upper bound for the change 
in elastic energy AW then an upper bound on the rates will follow. By performing a large number of 
numerical experiments, it has been found that AW can be bounded above as follows 

AW < C{N)w e 

where N is the number of occupied neighbors of the £th surface atom and wi is is the energy stored in 
the springs attached to this atom. The function C'(N) was found by extensive experiments in [21| and 
was verified for a multi-component film with intermixing 



C(N) 



2.4 if iV = 4 

3.5 if N > 4 



It seems remarkable that C(N) is independent of the type of surface atom and type of neighboring 
atoms. The dependence on those factors is contained in wg. 
The upper bounds are now, in view of Eq. ([3]), given as 

Rg = R a exp — — it TV > 3 (50) 



k B T 

For N < 3 we use the actual rates from Eq. ([3]). 

8 The Algorithm 



The KMC algorithm is the same as that given in [21] but for the convenience of the reader we present 
it here. 

Precomputation 

1. Calculate elastic field of crystal using multigrid- Fourier method. 

2. Compute Rdep and i?" p for I = 1 to M using ([5; 



Algorithm For Evolution 

1. Select an event by choosing a uniformly distributed random number r 6 [0, Z), with Z = Rdep + 

R^ P ■ This interval represents an overestimate of the sum of rates for atoms hopping plus the 
rate of deposition. The event to which r corresponds is located using a binary tree search [25] . 

2. If the event is a deposition, locally update the height, connection arrays and attempt a local 
clastic solve; revert to a full clastic solve if the expanding box exceeds size p ma x- Update the rate 
estimates using (|50l) in the same region in which the elastic field was updated. 

3. If the event selected is a hop, then take into account elastic effects by computing the actual hopping 
rate Ri < R^ p that depends on the energy difference of the system with and without the atom. 
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(a) Make a copy of the atom-on displacement field u in the domain il Pmas . Follow the same 
procedures in Step 2 to compute the displacement field with the atom removed, u° (atom 
off). 

(b) Once the elastic field has been updated (locally or globally as necessary), calculate the energy 
barrier and actual rate R^. 

(c) Use rejection to decide whether or not to make the move. Note that the atom-off calculation 
must be performed whether or not this move is made. 

(d) If the move is rejected, no change is made to the displacement field. Return to Step 1. 

(c) If the move is accepted, a hop is made. Update the displacement field in the vacated position 
using u^. Perform a second local/global calculation in the atom's new position thereby 
updating u. 

4. Return to Step 1. One event has been completed 

9 Results 

In this section we present some results using the parameters from [18] namely, Eq — 0.53eU, Rq = 
2D /a 2 ss , D = 3.83 x 10 13 A 2 /sec, a ss = 2.73A, e gg = .04, e sg = .02. The spring constants K L = 
13.85eV/a 2 and Kr> = Kl/2. These were chosen to model the Ge/Si system. In our simulations we 
took the bond strength to be 7 = 0.37eU whereas [18] choose 7 = 0.4eU. This was done in order to 
promote the effects of intermixing. It should pointed out that the above numbers are very approximate 
and our choice of .37 eV instead of .4 eV is still well with range of physically reasonable numbers. The 
simulations presented here were conducted at a temperature of 600K. Finally we mention that we take 

ec = eioc = icr 2 . 
9.1 Example 1 

The first example outlines the effect of intermixing on the film evolution. The results arc shown in Figs. 
|9"1 and [Till In both cases pure Germanium was deposited on pure Silicon substrate at a rate of 0.8ML/s. 
Fig. [§] shows the results of such a simulation. This figure clearly shows the growth of islands arising 
from elastic interactions. It also shows the formation of valleys between islands. The valleys arise due 
stress concentration at the base of the islands. The elastic energy is lowered when the valleys form. 
During the the course of the simulation the rejection rate was initially 0.2 % at 1ML and increased to 
1.14% at 10ML. In addition, the displacement field was updated 9.53 x 10 8 times of which 1679 were 
global updates. 

Fig. [TUlon the other hand shows the same simulation except the Silicon atoms not allowed to hop. It 
is clear that the islands form sooner: in the case with no intermixing there are well defined islands after 
just 2 monolayers of depositions whereas with intermixing it is closer to 6. This is because intermixing 
will lower the effective misfit thereby weakening the elastic interactions. A closer inspection reveals that 
the pure Ge that was deposited is diluted by around 30 %. 

To further illustrate the effects of the dilution on the film evolution we present a plot of the square 
of the roughness cu 2 (the variance of the height function) as the function of the film thickness in Fig. [TT1 
The data represents the ensemble average over 10 different runs. In can be observed that intermixing 
initially suppresses growth of the roughness. However due to the valley formation the roughness in the 
intermixing case will ultimately exceed that of the no mixing case. 
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Figure 9: Pure Ge on Si : The Ge atoms are represented by red squares and the Si by green squares. 
The results show the evolution after 2, 4, 6, 8 and 10 ML deposition. 



9.2 Example 2 

In this example we deposited 15ML of pure Ge on Si at a rate of 10 ML/s and at a temperature of 
600K. This was capped by 30 ML of Si at the same flux and followed by another 15 ML of Ge and so 
on. The Ge readily forms islands on the Si substrate. When a new Ge film is grown on the capping 
layer the quantum dots align with the ones buried below. By this process they self assemble to form an 
array of stacked quantum dots as shown in Fig. 1121 
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Figure 10: Pure Ge on Si : The Ge atoms are represented by red squares and the Si by green squares. 
The results show the evolution after 2, 4, 6, 8 and 10 ML deposition. The Si atoms have been fixed by 
setting their hopping rates to zero. 



10 Summary 

In this work, we have presented a kinetic Monte Carlo model for strained heteroepitaxial growth with 
intermixing. The model is based on a solid-on-solid, bond counting formulation [26] in which elastic 
interactions are accounted for with balls and springs on cubic lattice [T7l [18] . We have also introduced 
an algorithm for the efficient simulation of this model. This algorithm is based on the work in pT9 j f20 | f2Tj 
but was extended to include intermixing and other improvements. In particular, the Fourier-multigrid 
method developed in this work has much neater way of coupling the film and the substrate as compared to 
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Amount of Film Deposited in MLs Amount of Film Deposited in MLs 



Figure 11: Roughness Plot : This figure shows the plot of uj 2 the square of the roughness against the film 
thickness in ML, averaged over 10 independent runs. The red plot shows the behavior with intermixing 
and the blue without intermixing. The figure on the right zooms into the first 5 MLs of crystal growth. 
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Figure 12: Stacked Quantum dots: Quantum dots of 15ML Ge with a capping layer of 30 ML Si 

that in |20j . In addition, the course-graining and prolongation operators have been simplified. We have 
presented numerical evidence that the principle of energy localization is valid in the case of intermixing 
and that the asymptotic results presented in [21] also remain true. Our results also indicate that the 
expanding box method works well in the case of intermixing. We have also demonstrated that the 
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formula for the upper bounds on the rates proposed by Schulze & Smereka[3T] appears to work well 
even for mixtures. 

A preliminary study on the growth of strained films with intermixing is presented. The study 
indicates the presence of an Asaro-Tiller-Grinfeld type instability of the flat strained film leading to 
the formation of quantum dots. The intermixing is found to considerably lower the strain in the film 
resulting in a stabilizing effect that delays the onset of islands. This gives rise to what Tu & Tersoff[9 
call an apparent critical thickness. Eventually this critical layer disappears as the valleys form. A 
detailed study of the effects of intermixing on the critical thickness is currently under way. The model 
also successfully predicts self assembly of stacked arrays of quantum dots. 
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